Conversation
|
R: @dhalperi |
233c099 to
2505d0b
Compare
|
Rebased to fix error produced by the change of Create.of() => Create.empty |
|
R: @jbonofre |
|
First of all the tests have to be fixed. I gonna take a look. |
|
There is something fishy going on, I did mvn clean verify in my machine and everything passes. We need to check if there is something stopping the HBaseTestUtility from been instantiated in jenkins, because the tests that fail are the ones that use the embedded instance. |
|
R: -@jbonofre -- I have already begun a substantial review of this one. Save your time and enjoy your vacation. |
|
@iemejia I suggest that the precommit bugs are at the least identifying places you can improve error catching. Try adding more error reporting so you can see where things are going wrong. |
|
Yes sir ! the issue is that almost all the bugs are null pointer exceptions in the objects that should have been initialized in @BeforeClass, I will log initialization to check if this is going well indeed. |
sdks/java/io/hbase/pom.xml
Outdated
| <dependency> | ||
| <groupId>org.apache.beam</groupId> | ||
| <artifactId>beam-runners-direct-java</artifactId> | ||
| <version>${project.version}</version> |
There was a problem hiding this comment.
drop -- should be inherited
| <properties> | ||
| <hbase.version>1.3.0</hbase.version> | ||
| <hbase.guava.version>12.0.1</hbase.guava.version> | ||
| <hbase.protobuf.version>2.5.0</hbase.protobuf.version> |
There was a problem hiding this comment.
Shade all these dependencies for HBaseIO module itself. Don't leak protobuf or guava dependencies.
There was a problem hiding this comment.
will do, I doubted, but since this is the current policy I will fix it.
| <artifactId>commons-lang</artifactId> | ||
| <version>2.6</version> | ||
| <scope>test</scope> | ||
| </dependency> |
There was a problem hiding this comment.
shade any of these you do not need to expose on the public API surface
There was a problem hiding this comment.
This one is scoped to test so it should not leak. I don't think we need to shade it. Or do you have a strong reason to do it ?
There was a problem hiding this comment.
Ack, you're right, I missed the test
| * .withTableId("table")) | ||
| * .withScan(scan)); | ||
| * | ||
| * // Scan a prefix of the table. |
There was a problem hiding this comment.
"prefix" not quite the right word I think, but good.
There was a problem hiding this comment.
I copy exactly this line from the BigtableIO doc, tell me what you prefer and I will change it.
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L102
| * <pre>{@code | ||
| * Configuration configuration = ... | ||
| * PCollection<KV<ByteString, Iterable<Mutation>>> data = ...; | ||
| * data.setCoder(HBaseIO.WRITE_CODER) |
There was a problem hiding this comment.
Best for the end, this one I really need you to give me some hints.
Notice that to make the API similar to Bigtable, I let this Mutation there, however HBase mutation is not Serializable. So I need to encode it to pass it, but I didn't find how to do it, because if the user is doing it externally he can use a CoderRegistry or do the setCoder in the PCollection.
I don't know how to solve it otherwise because I would need something like the opposite of getDefaultOutputCoder. I mean, like getDefaultInputCoder, and I didn't find this in the API or is there any other way to do this that I am missing, can you give me some ideas ?
There was a problem hiding this comment.
Ha! I literally meant "there's a missing ;".
@kennknowles filed a JIRA recently about registering coders dynamically. As is, something like this may be required, or of course the user adds custom configuration to the pipeline coder registry.
There was a problem hiding this comment.
Hehe ok, can you please refer me to the JIRA number to start watching it and improve this once it is solved.
| if ((scanWithNoLowerBound | ||
| || isLastRegion || Bytes.compareTo(startRow, endKey) < 0) | ||
| && (scanWithNoUpperBound || Bytes.compareTo(stopRow, startKey) > 0)) { | ||
| regionLocations.add(regionLocation); |
There was a problem hiding this comment.
looks right. Commenting for myself later that I need to look at how well this is tested.
There was a problem hiding this comment.
Is there a reason you didn't use the ByteKey and ByteKeyRange functions like overlap and contains to simplify this logic?
There was a problem hiding this comment.
At the begining I didn't want to use both ByteKey and ByteKeyRange because I thought they were related to Bigtable not Beam. I propose you to change this logic afterwards, in a subsequent PR, maybe with the subsplits work, I don't really want to change the current logic.
There was a problem hiding this comment.
No, ByteKey and ByteKeyRange are not Bigtable-specific -- they're specifically added to provide a standard set of common and tricky logic around keys that are byte[].
| || Bytes.compareTo(startKey, startRow) >= 0) ? startKey : startRow; | ||
| final byte[] splitStop = | ||
| (scanWithNoUpperBound || Bytes.compareTo(endKey, stopRow) <= 0) | ||
| && !isLastRegion ? endKey : stopRow; |
There was a problem hiding this comment.
Is there a reason you didn't use the ByteKey and ByteKeyRange functions like overlap and contains to simplify this logic?
| LOG.debug("Closing reader after reading {} records.", recordsReturned); | ||
| if (scanner != null) { | ||
| scanner.close(); | ||
| scanner = null; |
There was a problem hiding this comment.
will this also close the connection allocated in start?
There was a problem hiding this comment.
No, excellent catch, fixing right now.
| @AfterClass | ||
| public static void afterClass() throws Exception { | ||
| admin.close(); | ||
| htu.shutdownMiniCluster(); |
There was a problem hiding this comment.
these should always look like if (admin != null) { admin.close(); admin = null; } that will make your tests more resilient.
| CoderRegistry cr = TestPipeline.create().getCoderRegistry(); | ||
| cr.registerCoder(Mutation.class, HBaseMutationCoder.class); | ||
| } catch (IOException e) { | ||
| e.printStackTrace(); |
There was a problem hiding this comment.
Why this instead of throw?
2505d0b to
8c1d79f
Compare
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 2beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 2
--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
This looks to me like HBase is trying to bind an internet-public IP for
listening, which is likely not allowed. It should be binding something like
`localhost`. Is this something you can override in the test config?
…On Mon, Feb 13, 2017 at 9:34 AM, asfbot ***@***.***> wrote:
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7357/
Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:
beam-sdks-java-io-hbase
<https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7357/org.apache.beam$beam-sdks-java-io-hbase/testReport>:
1
-
*org.apache.beam.sdk.io.hbase.HBaseIOTest.org.apache.beam.sdk.io.hbase.HBaseIOTest*
<https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7357/org.apache.beam$beam-sdks-java-io-hbase/testReport/org.apache.beam.sdk.io.hbase/HBaseIOTest/org_apache_beam_sdk_io_hbase_HBaseIOTest/>
--none--
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1961 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAgIT_1yzmfpWs87Kc_PB3f20u5y6OEgks5rcJQvgaJpZM4L8Sg9>
.
|
|
Arrghh I was lucky to be close to 1 test to pass :) |
jbonofre
left a comment
There was a problem hiding this comment.
It looks better. Maybe the binding host as a static var is better.
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
| public static void beforeClass() throws Exception { | ||
| LOG.info("Starting HBase Embedded Server (HBaseTestUtility)"); | ||
| System.out.println("Starting HBase Embedded Server (HBaseTestUtility)"); | ||
| conf.setInt(HConstants.HBASE_CLIENT_RETRIES_NUMBER, 1); |
There was a problem hiding this comment.
I think that ultimately what you want is to, in your configuration, set the HBase server name to localhost or 127.0.0.1
|
Now that I'm back from vacation, I will help for the review with Dan. |
|
Refer to this link for build results (access rights to CI server needed): Build result: FAILURE[...truncated 6325 lines...] at hudson.remoting.UserRequest.perform(UserRequest.java:153) at hudson.remoting.UserRequest.perform(UserRequest.java:50) at hudson.remoting.Request$2.run(Request.java:332) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)Caused by: org.apache.maven.plugin.MojoFailureException: You have 3 Checkstyle violations. at org.apache.maven.plugin.checkstyle.CheckstyleViolationCheckMojo.execute(CheckstyleViolationCheckMojo.java:588) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) ... 31 more2017-02-17T09:25:55.980 [ERROR] 2017-02-17T09:25:55.980 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-02-17T09:25:55.980 [ERROR] 2017-02-17T09:25:55.980 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-02-17T09:25:55.980 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException2017-02-17T09:25:55.981 [ERROR] 2017-02-17T09:25:55.981 [ERROR] After correcting the problems, you can resume the build with the command2017-02-17T09:25:55.981 [ERROR] mvn -rf :beam-sdks-java-io-hbasechannel stoppedSetting status of 3e9c921 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7534/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 1beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
Refer to this link for build results (access rights to CI server needed): Failed Tests: 3beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-runners-spark: 2
beam_PreCommit_Java_MavenInstall/org.apache.beam:beam-sdks-java-io-hbase: 1--none-- |
|
I took a pass on the HBase Configuration in #2036. The changes I made to the test seem to have worked -- the most recent Jenkins run number 7544 actually tested and passed HBaseIOTest. [Jenkins build failed overall because of the awful hacks I did to the poms to streamline testing.] Please take a look -- I hope this unblocks you getting this green! |
|
https://github.com/apache/beam/pull/2036/files/cbf29331a53102680fcbd468757f8521c8292cff..8f2be281b84d3d47e5d7db8bf7c1f8d20013e209 was my specific hacks. (ignore anything outside of HBaseIOTest) |
|
Refer to this link for build results (access rights to CI server needed): |
|
Finally it passed ! |
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> | ||
| <artifactId>maven-jar-plugin</artifactId> | ||
| </plugin> |
There was a problem hiding this comment.
I think most of these should be inherited. Are they needed?
There was a problem hiding this comment.
Yes, they should be put explicitly, all the other IOs do the same.
There was a problem hiding this comment.
I do not think they are needed. Please look at https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/pom.xml
The only place we have them is in pluginManagement, to override config from parent.
There was a problem hiding this comment.
I have prepared a subsequent PR to remove the need to import these three maven plugins from all the IOs, but I am hoping for this one to get merged to open it.
There was a problem hiding this comment.
Hmm, clearly I'm crazy. Thank you.
| for (Result result : scanner) { | ||
| results.add(result); | ||
| } | ||
| System.out.println(estimatedBytesSize); |
| htu = new HBaseTestingUtility(conf); | ||
| htu.startMiniCluster(1, 4); | ||
| admin = htu.getHBaseAdmin(); | ||
| System.out.println("Started HBase Embedded Server (HBaseTestUtility)"); |
There was a problem hiding this comment.
Oh that was a mistake from trying to do jenkins happy, thanks for noticing.
| /** | ||
| * This is just a wrapper class to serialize HBase {@link Scan}. | ||
| */ | ||
| public class SerializableScan implements Serializable { |
There was a problem hiding this comment.
Are these intended for use outside of the hbase module, or only inside of the HBaseIO module? If the latter, I would not make them public.
There was a problem hiding this comment.
They are not, however as they live in the coders package, I have to make them public, otherwise I cannot import them. One approach would be to move them to the same directory but I think this makes the code more messier (even if I agree with you about the desired access level).
8eb46f1 to
1aee6cc
Compare
|
Ok, fixed last remarks,and rebased into one single commit, I hope it is ok right now so it can be merged soon. |
|
Refer to this link for build results (access rights to CI server needed): |
1aee6cc to
a1ad2a4
Compare
|
Notice that this extra rebase was to add the IO to the parent pom and the javadoc pom. |
|
Refer to this link for build results (access rights to CI server needed): |
|
I already updated the other IOs. So, please, don't update the other IO in another PR. |
|
Merged! Can you file JIRAs for follow-on work (adding support for Dynamic work rebalancing, improving splitting, switching to ByteKey, ByteKeyRange, ByteKeyRangeTracker to help the first 2)? |
|
Awesome, thanks a lot Dan for your review + the binding fix, I will be filling the JIRAs now. |
|
JB, as discussed, I will do the update to remove extra plugins for the other IOs, thanks for letting me that one. |
|
Sure ! Thanks ! I set my branch "on hold" ;) |
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
[BEAM-<Jira issue #>] Description of pull requestmvn clean verify. (Even better, enableTravis-CI on your fork and ensure the whole test matrix passes).
<Jira issue #>in the title with the actual Jira issuenumber, if there is one.
Individual Contributor License Agreement.